Spectral Learning of Binomial HMMs for DNA Methylation Data

Authors

  • Chicheng Zhang
  • Eran A. Mukamel
  • Kamalika Chaudhuri
Abstract

We consider learning the parameters of Binomial Hidden Markov Models (HMMs), which can be used to model DNA methylation data. The standard algorithm for this problem is EM, which is computationally expensive for sequences on the scale of the mammalian genome. Recently developed spectral algorithms can learn the parameters of latent variable models via tensor decomposition and are highly efficient on large data sets. However, these methods have so far been applied only to categorical HMMs; the main challenge is extending them to Binomial HMMs while retaining computational efficiency. We address this challenge by introducing a new feature-map based approach that exploits specific properties of Binomial HMMs. We provide theoretical performance guarantees for our algorithm and evaluate it on real DNA methylation data.
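To make the model class concrete, here is a minimal sketch of the generative process of a Binomial HMM. It assumes the usual reading for methylation data: at each genomic position a hidden state selects a methylation probability, and the number of methylated reads is binomial in the read coverage at that position. It is not the spectral algorithm from the paper. The function name `sample_binomial_hmm` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_binomial_hmm(pi, T, p, coverage, seed=0):
    """Sample hidden states and methylated-read counts from a Binomial HMM.

    pi       -- initial state distribution (length m)
    T        -- m x m transition matrix, T[i][j] = P(next = j | current = i)
    p        -- per-state methylation probability (length m)
    coverage -- read coverage n_t at each position (list of non-negative ints)
    """
    rng = random.Random(seed)
    m = len(pi)
    states, counts = [], []
    # Draw the first hidden state from the initial distribution.
    h = rng.choices(range(m), weights=pi)[0]
    for n_t in coverage:
        states.append(h)
        # Binomial(n_t, p[h]) drawn as a sum of n_t Bernoulli trials.
        counts.append(sum(rng.random() < p[h] for _ in range(n_t)))
        # Move to the next hidden state according to row h of T.
        h = rng.choices(range(m), weights=T[h])[0]
    return states, counts


# Hypothetical two-state example: state 0 is mostly unmethylated,
# state 1 is mostly methylated.
pi = [0.5, 0.5]
T = [[0.9, 0.1], [0.2, 0.8]]
p = [0.1, 0.9]
coverage = [10, 5, 0, 20, 8]
states, counts = sample_binomial_hmm(pi, T, p, coverage)
```

Each observation is the pair (coverage, methylated count), so the emission distribution depends on a per-position coverage rather than on a fixed alphabet. That dependence is what separates this setting from the categorical HMMs mentioned in the abstract.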


Related articles

Hilbert Space Embeddings of Hidden Markov Models

Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and are largely restricted to Gaussian and discrete observations. And, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that exten...


Predicting CpG Islands and DNA Methylation in the Cow Genome Using DNA Microarray Meta-Analysis and Genome Wide Scanning

DNA methylation is a type of epigenetic change that directly affects DNA. In mammals, DNA methylation is essential for fetal development and stem cell differentiation, and it occurs mainly within CpG islands. In this study, two methods were used to study the DNA methylation profile of the cow genome. In the first method, the DNA methylation profile of the differentially expres...


Implementing spectral methods for hidden Markov models with real-valued emissions

Hidden Markov models (HMMs) are widely used statistical models for modeling sequential data. The parameter estimation for HMMs from time series data is an important learning problem. The predominant methods for parameter estimation are based on local search heuristics, most notably the expectation–maximization (EM) algorithm. These methods are prone to local optima and oftentimes suffer from hi...


The role and importance of DNA methylation in spermatogenesis process

Background: DNA methylation is one of the epigenetic marks that are created by de novo DNA methylation and maintained through cell division. This process is catalyzed by DNA methyltransferases. Establishing DNA methylation in the germ line is important, since it has the potential to regulate gene expression in offspring, and improper DNA methylation patterns in germ lines have serious conseque...


A Stochastic Model for the Formation of Spatial Methylation Patterns

DNA methylation is an epigenetic mechanism whose important role in development has been widely recognized. This epigenetic modification results in heritable changes in gene expression not encoded by the DNA sequence. The underlying mechanisms controlling DNA methylation are only partly understood and recently different mechanistic models of enzyme activities responsible for DNA methylation have...



Journal:
  • CoRR

Volume: abs/1802.02498

Publication date: 2018